SHORT GCG EXPANSIONS IN THE PAB II GENE FOR OCULOPHARYNGEAL
MUSCULAR DYSTROPHY AND DIAGNOSTIC THEREOF

BACKGROUND OF THE INVENTION 

(a) Field of the Invention

The invention relates to the PAB II gene, and its uses for the
diagnosis, prognosis and treatment of a disease related to protein
accumulation in the nucleus, such as oculopharyngeal muscular
dystrophy.

(b) Description of Prior Art

Autosomal dominant oculopharyngeal muscular dystrophy (OPMD) is an
adult-onset disease with a worldwide distribution. It usually
presents in the sixth decade with progressive swallowing
difficulties (dysphagia), eyelid drooping (ptosis) and proximal
limb weakness. Unique nuclear filament inclusions in skeletal
muscle fibers are its pathological hallmark (Tome, F.M.S. &
Fardeau, Acta Neuropath. 49, 85-87 (1980)). We isolated the
poly(A) binding protein II (PAB II) gene from a 217 kb candidate
interval in chromosome 14q11. A (GCG)6 repeat encoding a
polyalanine tract located at the N-terminus of the protein was
expanded to (GCG)8-13 in the 144 OPMD families screened. More
severe phenotypes were observed in compound heterozygotes for the
(GCG)9 mutation and a (GCG)7 allele found in 2% of the population,
whereas homozygosity for the (GCG)7 allele leads to autosomal
recessive OPMD. Thus the (GCG)7 allele is an example of a
polymorphism which can act either as a modifier of a dominant
phenotype or as a recessive mutation. Pathological expansions of
the polyalanine tract may cause mutated PAB II oligomers to
accumulate as filament inclusions in nuclei.

It would be highly desirable to be provided with a tool for the
diagnosis, prognosis and treatment of a disease related to
polyalanine accumulation in the nucleus, such as oculopharyngeal
muscular dystrophy.

WO 99/29896                                        PCT/CA98/01133

SUMMARY OF THE INVENTION 

One aim of the present invention is to provide a tool for the
diagnosis, prognosis and treatment of a disease related to
polyalanine accumulation in the nucleus, such as oculopharyngeal
muscular dystrophy.

In accordance with the present invention there is provided a human
PAB II gene containing a transcribed polymorphic GCG repeat, which
comprises a sequence as set forth in Fig. 4, which includes introns
and flanking genomic sequence.

The allelic variants of the GCG repeat of the human PAB II gene are
associated with a disease related to protein accumulation in the
nucleus, such as polyalanine accumulation, or with a disease
related to swallowing difficulties, such as oculopharyngeal
muscular dystrophy.

In accordance with the present invention there is also provided a
method for the diagnosis of a disease with protein accumulation in
the nucleus, which comprises the steps of:

a) obtaining a nucleic acid sample of said patient; and

b) determining allelic variants of the GCG repeat of the human
   PAB II gene, wherein long allelic variants are indicative of a
   disease related to protein accumulation in the nucleus, such as
   polyalanine accumulation and oculopharyngeal muscular dystrophy.

The long allelic variants are from about 245 to about 263 bp in
length.

In accordance with the present invention there is also provided a
non-human mammal model for the human PAB II gene, whose germ cells
and somatic cells are modified to express at least one allelic
variant of the PAB II gene, said allelic variant of the PAB II gene
being introduced into the mammal, or an ancestor of the mammal, at
an embryonic stage.

In accordance with the present invention there is also provided a
method for the screening of therapeutic agents for the prevention
and/or treatment of oculopharyngeal muscular dystrophy, which
comprises the steps of:

a) administering said therapeutic agents to the non-human mammal
   of the present invention or to oculopharyngeal muscular
   dystrophy patients; and

b) evaluating the prevention and/or treatment of development of
   oculopharyngeal muscular dystrophy in said mammal or said
   patients.

In accordance with the present invention there is also provided a
method to identify genes that are part of, or interact with, a
biochemical pathway affected by the PAB II gene, which comprises
the steps of:

a) designing probes and/or primers using the hGTl gene of the
   PAB II gene and screening samples from oculopharyngeal muscular
   dystrophy patients with said probes and/or primers; and

b) evaluating the role of the identified gene in oculopharyngeal
   muscular dystrophy patients.


BRIEF DESCRIPTION OF THE DRAWINGS 

Figs. 1A-B illustrate the positional cloning of the PAB II gene;

Figs. 2A-G illustrate the OPMD (GCG)n expansion sizes and sequence
of mutations (SEQ ID NOS:1-2);

Fig. 3 illustrates the age distribution of swallowing time (st) for
French Canadian OPMD carriers of the (GCG)9 mutation; and

Fig. 4 illustrates the nucleotide sequence of human poly(A) binding
protein II (hPAB II) (SEQ ID NO:3).

DETAILED DESCRIPTION OF THE INVENTION

In order to identify the gene mutated in OPMD, we constructed a
350 kb cosmid contig between flanking markers D14S990 and D14S1457
(Fig. 1A). Fig. 1A shows the positions of the PAB II selected cDNA
clones in relation to the EcoRI restriction map and the
Genealogy-based Estimate of Historical Meiosis (GEHM)-derived
candidate interval (Rommens, J.M. et al., in Proceedings of the
third international workshop on the identification of transcribed
sequences (eds. Hochgeschwender, U. & Gardiner, K.) 65-79 (Plenum,
New York, 1994)).

The human poly(A) binding protein II gene (PAB II) is encoded by
the nucleotide sequence as set forth in Fig. 4.

Twenty-five cDNAs were isolated by cDNA selection from the
candidate interval (Rommens, J.M. et al., in Proceedings of the
third international workshop on the identification of transcribed
sequences (eds. Hochgeschwender, U. & Gardiner, K.) 65-79 (Plenum,
New York, 1994)). Three of these hybridized to a common 20 kb EcoRI
restriction fragment and showed high sequence homology to the
bovine poly(A) binding protein II gene (bPAB II) (Fig. 1A). The
PAB II gene appeared to be a good candidate for OPMD because it
mapped to the genetically defined 0.26 cM candidate interval in
14q11 (Fig. 1A), its mRNA showed a high level of expression in
skeletal muscle, and the PAB II protein is exclusively localized to
the nucleus (Krause, S. et al., Exp. Cell Res. 214, 75-82 (1994))
where it acts as a factor in mRNA polyadenylation (Wahle, E., Cell
66, 759-768 (1991); Wahle, E. et al., J. Biol. Chem. 268, 2937-2945
(1993); Bienroth, S. et al., EMBO J. 12, 585-594 (1993)).

We subcloned an 8 kb HindIII genomic fragment containing the
PAB II gene, and sequenced 6002 bp (GenBank: AF026029) (Nemeth, A.
et al., Nucleic Acids Res. 23, 4034-4041 (1995)) (Fig. 1B). Fig. 1B
shows the genomic structure of the PAB II gene and the position of
the OPMD (GCG)n expansions. Exons are numbered. Introns 1 and 6 are
variably present in 60% of cDNA clones. ORF, open reading frame;
cen, centromere; and tel, telomere.

The coding sequence was based on the previously published bovine
sequence (GenBank: X89969) and the sequence of 31 human cDNAs and
ESTs. The gene is composed of 7 exons and is transcribed in the
cen-qter orientation (Fig. 1B). Multiple splice variants are found
in ESTs and on Northern blots (Nemeth, A. et al., Nucleic Acids
Res. 23, 4034-4041 (1995)). In particular, introns 1 and 6 are
present in more than 60% of clones (Fig. 1B) (Nemeth, A. et al.,
Nucleic Acids Res. 23, 4034-4041 (1995)). The coding and protein
sequences are highly conserved between human, bovine and mouse
(GenBank: U93050). 93% of the PAB II sequence was readily amenable
to RT-PCR- or genomic-SSCP screening. No mutations were uncovered
using either technique. However, a 400 bp region of exon 1
containing the start codon could not be readily amplified. This
region has a GC content of 80%. It includes a (GCG)6 repeat which
codes for the first six alanines of a homopolymeric stretch of 10
(Fig. 2G). Fig. 2G shows the nucleotide sequence of the mutated
region of PAB II, the amino acid sequences of the N-terminus
polyalanine stretch and the position of the OPMD alanine
insertions.
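
As an informal illustration of how the (GCG)6 repeat relates to the stretch of 10 alanines, the following Python sketch counts the GCG codons and the consecutive alanine codons (GCN) that follow the start codon. The fragment is copied from the SEQ ID NO:3 listing (the row ending at position 1320), starting at the ATG; the helper names are hypothetical and are not part of the disclosure.

```python
# Sketch only: helper names are hypothetical, not part of the disclosure.

# Start of the exon 1 coding region, copied from the SEQ ID NO:3 listing
# (row ending at position 1320), beginning at the ATG start codon.
EXON1_START = "atggcggc ggcggcggcg gcggcagcag cagcgggggc".replace(" ", "")


def codons(seq: str) -> list[str]:
    """Split a sequence into complete codons, dropping a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def leading_run(items: list[str], matches) -> int:
    """Count how many consecutive items, from the start, satisfy `matches`."""
    count = 0
    for item in items:
        if not matches(item):
            break
        count += 1
    return count


def gcg_repeat_count(coding_seq: str) -> int:
    """Number of consecutive GCG codons directly after the start codon."""
    after_start = codons(coding_seq.lower())[1:]
    return leading_run(after_start, lambda c: c == "gcg")


def polyalanine_length(coding_seq: str) -> int:
    """Number of consecutive alanine codons (GCN) directly after the start codon."""
    after_start = codons(coding_seq.lower())[1:]
    return leading_run(after_start, lambda c: c.startswith("gc"))


if __name__ == "__main__":
    print(gcg_repeat_count(EXON1_START))    # 6
    print(polyalanine_length(EXON1_START))  # 10
```

On this fragment the run of six GCG codons is followed by three GCA codons and one further GCG codon, which together give the 10 alanines mentioned in the text.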

Special conditions were designed to amplify by PCR a 242 bp genomic
fragment including this GCG repeat. The (GCG)6 allele was found in
98% of French Canadian non-OPMD control chromosomes, whereas 2% of
chromosomes carried a (GCG)7 polymorphism (n=86) (Brais, B. et al.,
Hum. Mol. Genet. 4, 429-434 (1995)).

Screening OPMD cases belonging to 144 families showed in all cases
a PCR product larger by 6 to 21 bp than that found in controls
(Fig. 2A). Fig. 2A shows the (GCG)6 normal allele (N) and the six
different (GCG)n expansions observed in 144 families.
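
The size differences quoted above follow from simple arithmetic on the repeat number. The Python sketch below is a minimal illustration, assuming that the 242 bp product corresponds to the normal (GCG)6 allele and that alleles differ only by whole GCG units of 3 bp; the function names are hypothetical.

```python
# Sketch only: assumes the 242 bp product carries the normal (GCG)6 allele
# and that alleles differ from it only by whole GCG units (3 bp each).
NORMAL_REPEATS = 6
NORMAL_FRAGMENT_BP = 242
BP_PER_REPEAT = 3


def fragment_size_bp(repeats: int) -> int:
    """Expected PCR product length for an allele with `repeats` GCG units."""
    return NORMAL_FRAGMENT_BP + BP_PER_REPEAT * (repeats - NORMAL_REPEATS)


def repeats_from_size(size_bp: int) -> int:
    """Infer the repeat number from a measured product length."""
    delta = size_bp - NORMAL_FRAGMENT_BP
    if delta % BP_PER_REPEAT != 0:
        raise ValueError("size does not differ from 242 bp by a multiple of 3")
    return NORMAL_REPEATS + delta // BP_PER_REPEAT


if __name__ == "__main__":
    for n in range(7, 14):
        print(f"(GCG){n}: {fragment_size_bp(n)} bp")
```

Under these assumptions the (GCG)8 to (GCG)13 expansions are 6 to 21 bp longer than the control product, and the (GCG)7 to (GCG)13 alleles span 245 to 263 bp, which matches the range given for the long allelic variants.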

Sequencing of these fragments revealed that the increased sizes
were due to expansions of the GCG repeat (Fig. 2G). Fig. 2F shows
the sequence of the (GCG)9 French Canadian expansion in a
heterozygous parent and his homozygous child: partial sequence of
exon 1 in a normal (GCG)6 control (N), a heterozygote (ht.) and a
homozygote (hm.) for the (GCG)9-repeat mutation. The number of
families sharing the different (GCG)n-repeat expansions is shown in
Table 1.

Table 1

Number of families sharing the different dominant (GCG)n OPMD
mutations

Mutations    Polyalanine†    Families
(GCG)8       12              4
(GCG)9       13              99
(GCG)10      14              19
(GCG)11      15              16
(GCG)12      16              5
(GCG)13      17              1
Total                        144

†, 10 alanine residues in normal PAB II.
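
The figures in Table 1 can be cross-checked with a few lines of Python. This is only a sketch: the dictionary simply restates the family counts from the table, and the helper name is hypothetical. It assumes that each GCG unit added to the normal (GCG)6 allele adds one alanine to the normal stretch of 10.

```python
# Sketch only: restates Table 1 and checks its internal arithmetic.
NORMAL_REPEATS = 6
NORMAL_ALANINES = 10

# Repeat number -> number of families, as listed in Table 1.
FAMILIES_BY_REPEATS = {8: 4, 9: 99, 10: 19, 11: 16, 12: 5, 13: 1}


def alanines_for_repeats(repeats: int) -> int:
    """Polyalanine length, assuming one extra alanine per extra GCG unit."""
    return NORMAL_ALANINES + (repeats - NORMAL_REPEATS)


TOTAL_FAMILIES = sum(FAMILIES_BY_REPEATS.values())

if __name__ == "__main__":
    for repeats, families in FAMILIES_BY_REPEATS.items():
        share = families / TOTAL_FAMILIES
        print(f"(GCG){repeats}: {alanines_for_repeats(repeats)} Ala, "
              f"{families} families ({share:.1%})")
    print("Total:", TOTAL_FAMILIES)
```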

The (GCG)9 expansion shared by 70 French Canadian families is the
most frequent mutation we observed (Table 1). The (GCG)9 expansion
is quite stable, with a single doubling observed in family F151 in
an estimated 598 French Canadian meioses (Fig. 2C). Fig. 2C
demonstrates the doubling of the French Canadian (GCG)9 expansion
in family F151.

This contrasts with the unstable nature of previously described
disease-causing triplet repeats (Rosenberg, R.N., New Eng. J. Med.
335, 1222-1224 (1996)).

Genotyping of all the participants in our clinical study of French
Canadian OPMD provided molecular insights into the clinical
variability observed in this condition. The genotypes for both
copies of the PAB II mutated region were added to an anonymous
version of our clinical database of 176 (GCG)9 mutation carriers
(Brais, B. et al., Hum. Mol. Genet. 4, 429-434 (1995)). Severity of
the phenotype can be assessed by the swallowing time (st) in
seconds taken to drink 80 cc of ice-cold water (Brais, B. et al.,
Hum. Mol. Genet. 4, 429-434 (1995); Bouchard, J.-P. et al., Can. J.
Neurol. Sci. 19, 296-297 (1992)). The late onset and progressive
nature of the muscular dystrophy is clearly illustrated in
heterozygous carriers of the (GCG)9 mutation (bold curve in Fig. 3)
when compared to the average st of control (GCG)6 homozygous
participants (n=76, thinner line in Fig. 3). In Fig. 3, the bold
curve represents the average OPMD st for carriers of only one copy
of the (GCG)9 mutation (n=169), while the thinner line corresponds
to the average st for (GCG)6 homozygous normal controls (n=76). The
black dot corresponds to the st value for individual VIII. Roman
numerals refer to individual cases shown in Figs. 2B and 2D and
discussed in the text. Fig. 2B shows the genotype of a homozygous
(GCG)9 case and her parents. Fig. 2D shows the independent
segregation of the (GCG)7 allele; case V has a more severe OPMD
phenotype.

Two groups of genotypically distinct OPMD cases have more severe
swallowing difficulties. Individuals I, II and III have an
early-onset disease and are homozygous for the (GCG)9 expansion
(P < 10'^) (Figs. 2B, F). Cases IV, V, VI and VII have more severe
phenotypes and are compound heterozygotes for the (GCG)9 mutation
and the (GCG)7 polymorphism (P < 10"^). In Fig. 2D the independent
segregation of the two alleles is shown. Case V, who inherited the
French Canadian (GCG)9 mutation and the (GCG)7 polymorphism, is
more symptomatic than his brother VIII who carries the (GCG)9
mutation and a normal (GCG)6 allele (Figs. 2D and 3). The (GCG)7
polymorphism thus appears to be a modifier of severity of dominant
OPMD. Furthermore, the (GCG)7 allele can act as a recessive
mutation. This was documented in the French patient IX who
inherited two copies of the (GCG)7 polymorphism and has a
late-onset autosomal recessive form of OPMD (Fig. 2E). Fig. 2E
shows that case IX, who has a recessive form of OPMD, inherited two
copies of the (GCG)7 polymorphism.
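
The genotype-phenotype relationships described above can be summarized as a small decision function. The Python sketch below is an illustrative simplification only and is not a diagnostic tool: the category labels and the function name are hypothetical, the thresholds restate the text ((GCG)6 normal, (GCG)7 polymorphism, (GCG)8 and longer dominant), and the "two expanded alleles" category generalizes from the (GCG)9 homozygotes, which are the only such cases the text documents.

```python
# Sketch only: a simplified restatement of the genotype groups in the text.
NORMAL = 6          # (GCG)6, normal allele
POLYMORPHISM = 7    # (GCG)7, modifier or recessive allele
MIN_DOMINANT = 8    # (GCG)8 and longer, dominant OPMD expansions


def classify_genotype(allele_a: int, allele_b: int) -> str:
    """Map the two GCG repeat numbers of a genotype to a descriptive label."""
    low, high = sorted((allele_a, allele_b))
    if low < NORMAL:
        raise ValueError("alleles shorter than (GCG)6 are not described in the text")
    if high >= MIN_DOMINANT:
        if low >= MIN_DOMINANT:
            # Documented in the text only for (GCG)9 homozygotes (early onset).
            return "two_expanded_alleles"
        if low == POLYMORPHISM:
            # Compound heterozygote: (GCG)7 acts as a modifier of severity.
            return "dominant_severe"
        return "dominant"
    if low == POLYMORPHISM:
        # Both alleles are (GCG)7: late-onset autosomal recessive OPMD.
        return "recessive"
    if high == POLYMORPHISM:
        return "polymorphism_carrier"
    return "normal"


if __name__ == "__main__":
    for genotype in [(6, 6), (6, 7), (7, 7), (6, 9), (7, 9), (9, 9)]:
        print(genotype, classify_genotype(*genotype))
```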

This is the first description of short trinucleotide repeat
expansions causing a human disease. The addition of only two GCG
repeats is sufficient to cause dominant OPMD. OPMD expansions do
not share the cardinal features of "dynamic mutations". The GCG
expansions are not only short, they are also meiotically quite
stable. Furthermore, there is a clear cut-off between the normal
and abnormal alleles, a single GCG expansion causing a recessive
phenotype. The PAB II (GCG)7 allele is the first example of a
relatively frequent allele which can act either as a modifier of a
dominant phenotype or as a recessive mutation. This dosage effect
is reminiscent of the one observed in a homozygote for two dominant
synpolydactyly mutations. In this case, the patient had more severe
deformities because she inherited two duplications causing an
expansion in the polyalanine tract of the HOXD13 protein (Akarsu,
A.N. et al., Hum. Mol. Genet. 5, 945-952 (1996)). A duplication
causing a similar polyalanine expansion in the α subunit 1 gene of
the core-binding transcription factor (CBFα1) has also been found
to cause dominant cleidocranial dysplasia (Mundlos, S. et al., Cell
89, 773-779 (1997)). The mutations in these two rare diseases are
not triplet repeats. They are duplications of "cryptic repeats"
composed of mixed synonymous codons and are thought to result from
unequal crossing over (Warren, S.T., Science 275, 408-409 (1997)).
In the case of OPMD, slippage during replication causing a
reiteration of the GCG codon is a more likely mechanism (Wells,
D.R., J. Biol. Chem. 271, 2875-2878 (1996)).

Different observations converge to suggest that a gain of function
of PAB II may cause the accumulation of nuclear filaments observed
in OPMD (Tome, F.M.S. & Fardeau, Acta Neuropath. 49, 85-87 (1980)).
PAB II is found mostly in dimeric and oligomeric form (Nemeth, A.
et al., Nucleic Acids Res. 23, 4034-4041 (1995)). It is possible
that the polyalanine tract plays a role in polymerization.
Polyalanine stretches have been found in many other nuclear
proteins such as the HOX proteins, but their function is still
unknown (Davies, S.W. et al., Cell 90, 537-548 (1997)). Alanine is
a highly hydrophobic amino acid present in the cores of proteins.
In dragline spider silk, polyalanine stretches are thought to form
β-sheet structures important in ensuring the fibers' strength
(Simmons, A.H. et al., Science 271, 84-87 (1996)). Polyalanine
oligomers have also been shown to be extremely resistant to
chemical denaturation and enzymatic degradation (Forood, B. et al.,
Bioch. and Biophy. Res. Com. 211, 7-13 (1995)). One can speculate
that PAB II oligomers comprised of a sufficient number of mutated
molecules might accumulate in the nuclei by forming undegradable
polyalanine-rich macromolecules. The rate of the accumulation would
then depend on the ratio of mutated to non-mutated protein. The
more severe phenotypes observed in homozygotes for the (GCG)9
mutation and compound heterozygotes for the (GCG)9 mutation and
(GCG)7 allele may correspond to the fact that in these cases PAB II
oligomers are composed only of mutated proteins. The ensuing faster
filament accumulation could cause accelerated cell death. The
recent description of nuclear filament inclusions in Huntington's
disease raises the possibility that "nuclear toxicity" caused by
the accumulation of mutated homopolymeric domains is involved in
the molecular pathophysiology of other triplet-repeat diseases
(Davies, S.W. et al., Cell 90, 537-548 (1997); Scherzinger, E. et
al., Cell 90, 549-558 (1997); DiFiglia, M. et al., Science 277,
1990-1993 (1997)). Future immunocytochemical and expression studies
will be able to test this pathophysiological hypothesis and provide
some insight into why certain muscle groups are more affected while
all tissues express PAB II.

Methods

Contig and cDNA selection

The cosmid contig was constructed by standard cosmid walking
techniques using a gridded chromosome 14-specific cosmid library
(Evans, G.A. et al., Gene 79, 9-20 (1989)). The cDNA clones were
isolated by cDNA selection as previously described (Rommens, J.M.
et al., in Proceedings of the third international workshop on the
identification of transcribed sequences (eds. Hochgeschwender, U. &
Gardiner, K.) 65-79 (Plenum, New York, 1994)).

Cloning of the PAB II gene. Three cDNA clones corresponding to
PAB II were sequenced (Sequenase, USB). Clones were verified to map
to cosmids by Southern hybridization. The 8 kb HindIII restriction
fragment was subcloned from cosmid 166G8 into pBluescript II (SK)
(Stratagene). The clone was sequenced using primers derived from
the bPAB II gene and human EST sequences. Sequencing of the PAB II
introns was done by primer walking.

PAB II mutation screening and sequencing. All cases were diagnosed
as having OPMD on clinical grounds (Brais, B. et al., Hum. Mol.
Genet. 4, 429-434 (1995)). RT-PCR- and genomic SSCP analyses were
done using standard protocols (Lafreniere, R.G. et al., Nat. Genet.
15, 298-302 (1997)). The primers used to amplify the PAB II mutated
region were: 5'-CGCAGTGCCCCGCCTTAGA-3' (SEQ ID NO:4) and
5'-ACAAGATGGCGCCGCCGCCCCGGC-3' (SEQ ID NO:5). PCR reactions were
performed in a total volume of 15 µl containing: 40 ng of genomic
DNA; 1.5 µg of BSA; 1 µM of each primer; 250 µM dCTP and dTTP;
25 µM dATP; 125 µM of dGTP and 125 µM of 7-deaza-dGTP (Pharmacia);
7.5% DMSO; 3.75 µCi [35S]dATP; 1.5 units of Taq DNA polymerase and
1.5 mM MgCl2 (Perkin Elmer). For non-radioactive PCR reactions the
[35S]dATP was replaced by 225 µM of dATP. The amplification
procedure consisted of an initial denaturation step at 95 °C for
five minutes, followed by 35 cycles of denaturation at 95 °C for
15 s, annealing at 70 °C for 30 s, elongation at 74 °C for 30 s and
a final elongation at 74 °C for 7 min. Samples were loaded on 5%
polyacrylamide denaturing gels. Following electrophoresis, gels
were dried and autoradiographs were obtained. Sizes of the inserts
were determined by comparing to a standard M13 sequence (Sequenase,
USB). Fragments used for sequencing were gel-purified. Sequencing
of the mutated fragment using the Amplicycle kit (Perkin Elmer) was
done with the 5'-CGCAGTGCCCCGCCTTAGAGGTG-3' (SEQ ID NO:6) primer at
an elongation temperature of 68 °C.
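
For planning purposes, the amplification procedure above can be written down as data and its nominal duration computed. The Python sketch below is illustrative only: the `Step` class and function name are hypothetical, the hold times and cycle number are those given in the text, and the result ignores ramp times between temperatures, which depend on the instrument.

```python
# Sketch only: encodes the cycling steps from the text and sums the hold
# times. Ramp times between temperatures are NOT included (assumption).
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: int
    seconds: int


INITIAL = Step("initial denaturation", 95, 5 * 60)
CYCLE = (
    Step("denaturation", 95, 15),
    Step("annealing", 70, 30),
    Step("elongation", 74, 30),
)
CYCLES = 35
FINAL = Step("final elongation", 74, 7 * 60)


def total_hold_seconds() -> int:
    """Sum of all programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + CYCLES * per_cycle + FINAL.seconds


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"{seconds} s of hold time, about {seconds / 60:.1f} min")
```

The programmed holds add up to 3345 s, a little under 56 minutes; an actual run takes longer because of ramping.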

Stability of (GCG)-repeat expansions. The meiotic stability of the
(GCG)9 repeat was estimated based on our large French Canadian OPMD
cohort. We previously established that a single ancestral OPMD
carrier chromosome was introduced in the French Canadian population
by three sisters in 1648. Seventy of the seventy-one French
Canadian OPMD families tested to date segregate a (GCG)9 expansion.
However, in family F151, the affected brother and sister, despite
sharing the French Canadian ancestral haplotype, carry a (GCG)12
expansion twice the size of the ancestral (GCG)9 mutation
(Fig. 2C). In our founder effect study, we estimated that 450
(304-594) historical meioses shaped the 123 OPMD cases belonging to
42 of the 71 enrolled families. Our screening of our full set of
participants allowed us to identify another 148 (GCG)9 carrier
chromosomes. Therefore, we estimate that a single mutation of the
(GCG)9 expansion has occurred in 598 (452-742) meioses.
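
The arithmetic behind the 598 (452-742) figure can be reproduced directly. The Python sketch below is one reading of the text, assuming that each of the 148 additional carrier chromosomes contributes one further meiosis to the point estimate and to both bounds of the founder-study range; the variable and function names are hypothetical.

```python
# Sketch only: reproduces the meiosis count under the stated assumption
# that each additional carrier chromosome adds one meiosis.
FOUNDER_STUDY_MEIOSES = (450, 304, 594)   # estimate, lower bound, upper bound
ADDITIONAL_CARRIER_CHROMOSOMES = 148
OBSERVED_SIZE_CHANGES = 1                  # the (GCG)12 allele in family F151


def total_meioses() -> tuple[int, int, int]:
    """Add the extra carrier chromosomes to the estimate and to both bounds."""
    return tuple(m + ADDITIONAL_CARRIER_CHROMOSOMES for m in FOUNDER_STUDY_MEIOSES)


def changes_per_meiosis() -> float:
    """Observed size changes divided by the point estimate of meioses."""
    estimate, _, _ = total_meioses()
    return OBSERVED_SIZE_CHANGES / estimate


if __name__ == "__main__":
    print(total_meioses())
    print(f"{changes_per_meiosis():.4f} changes per meiosis")
```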

Genotype-phenotype correlations. 176 carriers of at least one copy
of the (GCG)9 mutation were examined during the early stage of the
linkage study. All were asked to swallow 80 cc of ice-cold water as
rapidly as possible. Testing was stopped after 60 seconds. The
swallowing time (st) was validated as a sensitive test to identify
OPMD cases (Brais, B. et al., Hum. Mol. Genet. 4, 429-434 (1995);
Bouchard, J.-P. et al., Can. J. Neurol. Sci. 19, 296-297 (1992)).
The st values for 76 (GCG)6 homozygous normal controls are
illustrated in Fig. 3. Analyses of variance were computed by
two-way ANOVA (SYSTAT package). For the (GCG)9 homozygotes, their
mean st value was compared to the mean value for all (GCG)9
heterozygotes aged 35-40 (P < 10"^). For the (GCG)9 and (GCG)7
compound heterozygotes, their mean st value was compared to the
mean value for all (GCG)9 heterozygotes aged 45-65 (P < 10*^).

While the invention has been described in connection with specific
embodiments thereof, it will be understood that it is capable of
further modifications and this application is intended to cover any
variations, uses, or adaptations of the invention following, in
general, the principles of the invention and including such
departures from the present disclosure as come within known or
customary practice within the art to which the invention pertains
and as may be applied to the essential features hereinbefore set
forth, and as follows in the scope of the appended claims.

WHAT IS CLAIMED IS:

1. A human PAB II gene containing transcribed polymorphic GCG
repeat, which comprises a sequence as set forth in SEQ ID NO:3,
which includes introns and flanking genomic sequence.

2. The gene of claim 1, wherein allelic variants of GCG repeat are
associated with a disease related with protein accumulation in
nucleus.

3. The gene of claim 2, wherein said protein accumulation is
polyalanine accumulation.

4. The gene of claim 1, wherein allelic variants of GCG repeat are
associated with a disease related with swallowing difficulties.

5. The gene of claim 1, wherein said disease is oculopharyngeal
muscular dystrophy.

6. A method for the diagnosis of a disease with protein
accumulation in nucleus, which comprises the steps of:

a) obtaining a nucleic acid sample of said patient; and

b) determining allelic variants of GCG repeat of the gene of
   claim 1, and wherein long allelic variants are indicative of a
   disease related with protein accumulation in nucleus.

7. The method of claim 6, wherein said disease is oculopharyngeal
muscular dystrophy.

8. The method of claim 7, wherein said long allelic variants have
from about 245 to about 263 bp in length.

9. A non-human mammal model for the PAB II gene of claim 1, whose
germ cells and somatic cells are modified to express at least one
allelic variant of the PAB II gene and wherein said allelic variant
of the PAB II being introduced into the mammal, or an ancestor of
the mammal, at an embryonic stage.

10. A method for the screening of therapeutic agents for the
prevention and/or treatment of oculopharyngeal muscular dystrophy,
which comprises the steps of:

a) administering said therapeutic agents to the non-human mammal
   of claim 9 or oculopharyngeal muscular dystrophy patients; and

b) evaluating the prevention and/or treatment of development of
   oculopharyngeal muscular dystrophy in said mammal or said
   patients.

11. A method to identify genes part of or interacting with a
biochemical pathway affected by PAB II gene, which comprises the
steps of:

a) designing probes and/or primers using the hGTl gene of claim 1
   and screening oculopharyngeal muscular dystrophy patients
   samples with said probes and/or primers; and

b) evaluating the identified gene role in oculopharyngeal muscular
   dystrophy patients.

SEQUENCE LISTING

<110> McGILL UNIVERSITY
      ROULEAU, Guy A.
      BRAIS, Bernard

<120> SHORT GCG EXPANSIONS IN THE PAB II GENE
      FOR OCULOPHARYNGEAL MUSCULAR DYSTROPHY AND DIAGNOSTIC
      THEREOF

<130> 1770-199PCT FC/ld 

<150> CA 2,218,199
<151> 1997-12-09 

<160> 6 

<170> FastSEQ for Windows Version 3.0 

<210> 1 
<211> 57 
<212> DNA 

<213> Artificial Sequence 

<400> 1 

atggcggcgg cggcggcggc ggcagcagca atggcggcgg cggcggcggc ggcggca 57 

<210> 2 
<211> 35 
<212> DNA 

<213> Artificial Sequence 
<400> 2 

maaaaaaaaa agaaggrgsm aaaaaaaaaa agaag 35 
<210> 3 

<211> 6002
<212> DNA 

<213> Artificial Sequence 
<400> 3 

aatgaaggtg gacacccaaa tagccccaat acaaatgcct gttcaatcaa ccaaacatct 60 
aagcagcaca tctatgtggt agcatattgc caggccgtga gactgcgaat ataaatagga 120 
accgcccctc atctgcaggc gctcacaacc tagttagcaa acagtaaaac aattaagcgc 180 
gccgtggaca taggcccact tgtcctggga aatgagggga agctggggtt tgcagtggtt 240 
tgattgaagg gggactacat gttagaggca cagactgggt gcaggtacac ccaaaggaac 300 
gagaagagtg gaaggaaaca acatccacaa agtaaccaca tgctggcgta tcgaaggccg 360
tgatttacgg ttttgagact ttacctcgcc agcaaagggg ggccagtctg ttagcggtgc 420 
agattggagg ggtgacattg gaagctgtcc aggaaaaaga aaatggaact ggggagcaga 480 
aggcctacgc aagagggcgg gacagacagg acttgtgact agtagctctg gactgaggaa 540 
tcctccctgc tttctggtgc gggagagcta gtggatgatg gtgccaataa cctggatggg 600 
gaaagtaagc tccctcctgg aatgcttcat tcacaacctc cattttcagc aacatcccat 660 
ctactggtgc ttcctggtcg agatacaagt ttcctgaaac tgctgctctg ttttgggcct 720 
cacccggcca acagctcact agctggcaag cagtagtatc aagatggcgg ccccctagga 780 
ctggctagtc atgtgacctc gggtttccca agtttgaagc ccggcagtcc tttcgggggc 840
aaggttcacc tgtcacgaaa cgagtgtcac cccttcgact ctcgcaagcc aatcggcatc 900
tgagactggg ccactgcggt gaggcgatcg gaagattggt cctttccagt cgcctagcta 960
gggccaatca cggagcgtcc catacttcgc gggcccgccc gtaggccggg gagaagcagg 1020
aatatcgtca cagcgtggcg gtattattac ctaaggactc gataggaggt gggacgcgtg 1080
ttgattgaca ggcagatttc cctaccggga tttgagaatt tggcgcagtg cccgccttag 1140

aggtgcgctt atttgattgc caagtaatat tccccaatgg agtactagct ^.catggtgacg 1200 

ggcaggcagc ttgagctaat gagtcctccg tggccggcgc agctctccac atgccgggcg 1260 

gcgggcccca gtctgagcgg cgatggcggc ggcggcggcg gcggcagcag cagcgggggc 1320

tgcgggcggt cggggctccg ggccggggcg gcggcgccat cttgtgcosg gggccggtgg 1380 

ggaggccggg gagggggccc cggggggcgc aggggactac gggaacg^c tggagtctga 1440 

ggaactggag cctgaggagc tgctgctgga gcccgagccg gagcccgagc ccgaagagga 1500 

gccgccccgg ccccgcgccc ccccgggagc tccgggccct gggcctggtt cgggagcccc 1560 

cggcagccaa gaggaggagg aggagccggg actggtcgag ggtgacccgg gggacggcgc 1620 

cattgaggac ccggtgagga aggagggcga gcgagcaggc cggcggctgg cgcgtcactg 1680 

gaggcccaga gctcgggcga gcggtggcag gcggggggtg gggttgggcg gggaataacg 1740

tggctggggc gggtcgggcc ggggatgggt cagcgatcac tacaaggggc ccgactggct 1800 

tgattcgggc gtcacgggtg cctagtgttg ttctagagag ggtagctttt cttttatcac 1860 

gaccctcgca tggggcgagg gaaatggccg agcatggctg aggcgcgctc tggccgagag 1920 

cagggcacag cccctgcgtt ggttcctctt aagctgtcct ccataccctc cccacttata 1980 

ttaggagctg gaagctatca aagctcgagt cagggagatg gaggaagaag ctgagaagct 2040

aaaggagcta cagaacgagg tagagaagca gatgaatatg agtccacctc caggcaatgc 2100 

tgagtaactg gcggttgcac gcggagcccg ggttctcggg ttggaagggt tgtggggagg 2160 

atggggaatg tggggttaga tactcggcac cctggagctg cttgtctgag ctattatgac 2220 

tgtgccgcgg tcatagtccg ttgtgtgttc ctctgacctt tgtgaggcag aactgatatt 2280 

ttggtggtgg tagccttgtg cctccctttg tcctgttata attgtgttgc tctttattct 2340 

tagtctacgt ctatctttct ttggtagagg ttgcgtgctc gcatttgacc ttcaaatcta 2400 

atagtttttc ctccaattgg agacgcttta ggattctaag agaaagcaag ctggaagggg 2460

tttccccttt aaattctaga aatgtggagt ctcagcccac ttaattttgc tcactcttaa 2520 

aagcatttca accaaagcca ttcattaggg atttgatttg gagggcagga gggattccta 2580 

tactgtttta agtgtgtatt aattctttca atttatcgaa ttatttagtg agtaacctgc 2640 

tatgcactag gcactattct cggcttgtgg gtacagcagg gaacagcaca gaccaaaatc 2700 

tttgccttca ctgagcttat gggatagtgc tggtggtgga agtgcaacat attggtcaag 2760 

tagaaaacaa gtgtgtggtt tttgtaaaaa attatttttt cctgatagct ggcccggtga 2820 

tcatgtccat tgaggagaag atggaggctg atgcccgttc catctatgtt ggcaatgtga 2880 

cgtactgggg ctctgactgg ggttgggggc aagttcttct tttggggaat tatttaatag 2940 

tcctgaaaga acatctccgg gatagatgtg gttttgggtg tggagggagt gtgggaagga 3000 

ggttaaaggc aatggaatga tcagtaatca gcaaaggctc tgggtttgga aggaaaagag 3060 

attaattcct caaattacca gatttcatgt gctttggtgt atgatggccc agaccaaagg 3120

ctcgggaggg ttcttttgag acaggaattt gcctggtgcc tgfegaaattt ttctcctctc 3180 

atcaggtgga ctatggtgca acagcagaag agctggaagc tcactttcat ggctgtggtt 3240

cagtcaaccg tgttaccata ctgtgtgaca aatttagtgg ccatcccaaa ggtaaagtaa 3300 

aggggagtaa gttgagataa tttaaattac agtgtacaaa tagataaatt atgttttata 3360 

ttgagcagta agttatttgg tgttaacaca ggtgatctgt gtcatttaag atcatggcat 3420 

taatgttgat atatcaggag ttgcacctaa atgtcttcag aggccagata acaaaaatga 3480

aggctagatg tgggtgggat tacgaactag aaggggaggg gcagcttcta cttggcctat 3540 

tatggcatat ggaaattcag gccctgtgtg tcttattttt acaaatttca aagagtagct 3600 

ggaaatttta aaatttaaat gatttcgaat gattgaaatt ttccatttag aagaattttg 3660 

acaaataaaa aatataactg cattgtagcc caaaacgaag catgcctgca ggttgaattt 3720 

gacctgtgag gtatttgtaa cctcagagag atacaatgac aattcttttc aggtttgcgt 3780 

atatagagtt ctcagacaaa gagtcagtga ggacttcctt ggccttagat gagtccctat 3840 

ttagaggaag gcaaatcaag gtaagcctat gtccattgct gttctagttg tgtataaact 3900 

ctccaggttg cctttaaggc tatcatttgt tcatctctga ctcaggtgat cccaaaacga 3960 

accaacagac caggcatcag cacaacagac cggggttttc cacgagcccg ctaccgcgcc 4020 

cggaccacca actacaacag ctcccgctct cgattctaca gtggttttaa cagcaggccc 4080 

cggggtcgcg tctacaggtc aggatagatg ggctgctcct ctttcccccg cctcccgtga 4140

gccccgtatg cttcctcctc tctggtctga ggaacctccc tccccccacc cctccccgtg 4200 

gtcttcagga actttgtctc ctgcctgtgc aggttgagga aggtagttgc aggccaggcc 4260 

agaaggcagc ctcatcatct tttctgcagt agaaattggt gataagggct gcatccctcc 4320 

cttggttcaa agaggcttcc acccccagcc ttttttttct tgggagttgg tggcatttga 4380
aggtgtttgc ggacaaaact gggaggaaca gggcctccag gaagttigaaa gcactgcttg 4440
gacatttgtt acttttttcg gagttaggga gggattgaag actgaacctc ccttggaaga 4500
ataccagagg ctagctagtt gatcctccca acagccttgt gggaggattt tgagatactt 4560
attctttatt tgagccagtc ttgcaaggtt aacttctcac tgggcctagt gtggtnccca 4620
ggtttttgcc ttgcttcact tctgtctcta catttaaata gacgggttag gcatataaac 4680
cttggctttt cataagctct acctgcctat ccccaggagt tagggaggat ctatttgtga 4740
aggccctagg gtttaaaaac tgtggaggac tgaaaaactg gataaaaago gggtcctttt 4800
ccttgcccct gtctctcact cagatgcgct tctttttcgc cactgttt^ caaagttttc 4860
tgttaagccc ccctccccct gccccagttc tcccaggtgc gttactattt ctgggatcat 4920
ggggtcggtt ttaggacact tgaacacttc ttttcccccc ttcccttcac agtaactggg 4980
gcaggggcct acggggaggg gcttgtactg aactatctag tgatcacgtt aacacctaac 5040
tctccttctt tcttccaggg gccgggctag agcgacatca tggtattccc cttactaaaa 5100
aaagtgtgta ttaggaggag agagaggaaa aaaagaggaa agaaggaaaa aaaaaagaat 5160
taaaaaaaaa aaaaagaaaa acagaagatg accttgatgg aaaaaaaata ttttttaaaa 5220
aaaagatata ctgtggaagg ggggagaatc ccataactaa ctgctgagga gggacctgct 5280
ttggggagta ggggaaggcc cagggagtgg ggcagggggc tgcttattca ctctggggat 5340
tcgccatgga cacgtctcaa ctgcgcaagc tgcttgccca tgtttccctg cccccttcac 5400
ccccttgggc ctgctcaagg gtaggtgggc gtgggtggta ggagggtttt ttttacccag 5460
ggctctggaa ggacaccaaa ctgttctgct tgttaccttc cctcccgtct tctcctcgcc 5520
tttcacagtc ccctcctgcc tgctcctgtc cagccaggtc taccacccac cccacccctc 5580
tttctccggc tccctgcccc tccagattgc ctggtgatct attttgtttc cttttgtgtt 5640
tctttttctg ttttgagtgt ctttctttgc aggtttctgt agccggaaga tctccgttcc 5700
gctcccagcg gctccagtgt aaattcccct tccccctggg gaaatgcact accttgtttt 5760
ggggggttta ggggtgtttt tgtttttcag ttgttttgtt tttttgtttt ttttttttcc 5820
tttgcctttt ttccctttta tttggaggga atgggaggaa gtgggaacag ggaggtggga 5880
ggtggatttt gtttattttt ttagctcatt tccaggggtg ggaatttttt tttaatatgt 5940
gtcatgaata aagttgtttt tgaaaataaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 6000
6002



<210> 4 
<211> 19 
<212> DNA 

<213> Artificial Sequence 

<400> 4 
cgcagtgccc cgccttaga 

<210> 5 
<211> 24 
<212> DNA 

<213> Artificial Sequence 

<400> 5 
acaagatggc gccgccgccc cggc 

<210> 6
<211> 23
<212> DNA

<213> Artificial Sequence

<400> 6
cgcagtgccc cgccttagag gtg